Problem Statement 1: Exploring Realtor dataset This dataset is from Kaggle and contains both categorical/discrete (nominal and ordinal) and numeric (continuous) variables scraped from www.realtor.com real estate website. The data has over 900K observations (houses) and 12 columns (various attributes of houses). The goal is to explore the price variable and find association between house attributes and its price.

  1. (1pt) Explore the overall structure of the dataset using “str” and “summary” functions.
# Let's read the data from the dataset

realtor = read.csv("/Users/nancy/Downloads/realtor-1.csv")
realtor
# Explore the overall structure with str() and summary(), as the question requires:
str(realtor)
'data.frame':   923159 obs. of  12 variables:
 $ status      : chr  "for_sale" "for_sale" "for_sale" "for_sale" ...
 $ price       : num  105000 80000 67000 145000 65000 179000 50000 71600 100000 300000 ...
 $ bed         : num  3 4 2 4 6 4 3 3 2 5 ...
 $ bath        : num  2 2 1 2 2 3 1 2 1 3 ...
 $ acre_lot    : num  0.12 0.08 0.15 0.1 0.05 0.46 0.2 0.08 0.09 7.46 ...
 $ full_address: chr  "Sector Yahuecas Titulo # V84, Adjuntas, PR, 00601" "Km 78 9 Carr # 135, Adjuntas, PR, 00601" "556G 556-G 16 St, Juana Diaz, PR, 00795" "R5 Comunidad El Paraso Calle De Oro R-5 Ponce, Ponce, PR, 00731" ...
 $ street      : chr  "Sector Yahuecas Titulo # V84" "Km 78 9 Carr # 135" "556G 556-G 16 St" "R5 Comunidad El Paraso Calle De Oro R-5 Ponce" ...
 $ city        : chr  "Adjuntas" "Adjuntas" "Juana Diaz" "Ponce" ...
 $ state       : chr  "Puerto Rico" "Puerto Rico" "Puerto Rico" "Puerto Rico" ...
 $ zip_code    : num  601 601 795 731 680 612 639 731 730 670 ...
 $ house_size  : num  920 1527 748 1800 NA ...
 $ sold_date   : chr  "" "" "" "" ...
summary(realtor)
    status              price                bed              bath           acre_lot         full_address          street         
 Length:923159      Min.   :        0   Min.   :  1.00   Min.   :  1.00   Min.   :     0.00   Length:923159      Length:923159     
 Class :character   1st Qu.:   269000   1st Qu.:  2.00   1st Qu.:  1.00   1st Qu.:     0.11   Class :character   Class :character  
 Mode  :character   Median :   475000   Median :  3.00   Median :  2.00   Median :     0.29   Mode  :character   Mode  :character  
                    Mean   :   884123   Mean   :  3.33   Mean   :  2.49   Mean   :    17.08                                        
                    3rd Qu.:   839900   3rd Qu.:  4.00   3rd Qu.:  3.00   3rd Qu.:     1.15                                        
                    Max.   :875000000   Max.   :123.00   Max.   :198.00   Max.   :100000.00                                        
                    NA's   :71          NA's   :131703   NA's   :115192   NA's   :273623                                           
     city              state              zip_code       house_size       sold_date        
 Length:923159      Length:923159      Min.   :  601   Min.   :    100   Length:923159     
 Class :character   Class :character   1st Qu.: 2919   1st Qu.:   1130   Class :character  
 Mode  :character   Mode  :character   Median : 7004   Median :   1651   Mode  :character  
                                       Mean   : 6590   Mean   :   2142                     
                                       3rd Qu.:10001   3rd Qu.:   2499                     
                                       Max.   :99999   Max.   :1450112                     
                                       NA's   :205     NA's   :297843                      

According to the above summary, my personal deductions:

1. The total records are 923159.

2. These columns: price, bed, bath, acre_lot, zip_code, house_size have some missing values.

3. The minimum price for the house is 0, and the maximum price is 875,000,000.

  1. (1.5pt) Specify the type of each variable as follows: • Specify whether the variable is categorical(qualitative) or numeric(continuous)? • For qualitative variables, specify whether it is nominal or ordinal. • For numeric variables, specify whether it is discrete or continuous? • For discrete numeric variables specify whether it has interval scale (i.e., the difference between two values is meaningful) or not?
# We can deduce the type of variables with the str() function used above too, but we can also use the "class" function to specify if the variable is categorical or numerical.
class(realtor$status)
[1] "character"
class(realtor$price)
[1] "numeric"
class(realtor$bed)
[1] "numeric"
class(realtor$bath)
[1] "numeric"
class(realtor$acre_lot)
[1] "numeric"
class(realtor$full_address)
[1] "character"
class(realtor$street)
[1] "character"
class(realtor$city)
[1] "character"
class(realtor$state)
[1] "character"
class(realtor$zip_code)
[1] "numeric"
class(realtor$house_size)
[1] "numeric"
class(realtor$sold_date)
[1] "character"

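So, judging by storage type, we have 6 character (categorical) variables: status, full_address, street, city, state and sold_date.

And 6 numeric variables: price, bed, bath, acre_lot, zip_code and house_size. Note that class() reports only how R stores a column: zip_code is stored as numeric although it is an identifier, and sold_date is stored as character although it holds dates.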

# To specify, if its nominal or ordinal in the qualitative variables.
is.ordered(realtor$status)
[1] FALSE
is.ordered(realtor$full_address)
[1] FALSE
is.ordered(realtor$street)
[1] FALSE
is.ordered(realtor$city)
[1] FALSE
is.ordered(realtor$state)
[1] FALSE
is.ordered(realtor$sold_date)
[1] FALSE

Therefore, all the qualitative variables are nominal, because none of them has a natural order.
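One caveat is worth illustrating: is.ordered() only reports whether an object is an ordered factor, so it returns FALSE for any character vector regardless of what the values mean. The sketch below uses a made-up vector (`sizes` is hypothetical, not a column of the realtor data) to show the three cases.

```r
# Hypothetical toy vector; not part of the realtor data.
sizes <- c("small", "large", "medium", "small")

# A plain character vector is never an ordered factor.
chr_result <- is.ordered(sizes)

# An unordered factor is not ordered either.
fct_result <- is.ordered(factor(sizes))

# Only an explicitly ordered factor returns TRUE.
ord <- factor(sizes, levels = c("small", "medium", "large"), ordered = TRUE)
ord_result <- is.ordered(ord)

print(c(character = chr_result, factor = fct_result, ordered = ord_result))
```

So the nominal/ordinal decision has to come from the meaning of each variable, not from this function's output.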

# To understand if the numeric values are discrete or continuous, we can use the histogram plotting concept to visually reveal the distribution of values. Discrete variables might show distinct bars, while continuous variables may exhibit a smoother distribution.

hist(realtor$price, main = "Histogram")

hist(realtor$bed, main = "Histogram")

hist(realtor$bath, main = "Histogram")

hist(realtor$acre_lot, main = "Histogram")

hist(realtor$zip_code, main = "Histogram")

hist(realtor$house_size, main = "Histogram")

According to these plots, the variables that appear continuous are: price, acre_lot, house_size.

The discrete variables are: bed, bath, zip_code.
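As a possible complement to reading histograms, one can check whether a column holds only whole numbers. This is a rough sketch on toy data: `is_whole` and `toy` are hypothetical names, and a whole-number column can still be a rounded continuous measure, so the result is a hint rather than a verdict.

```r
# Hypothetical helper: TRUE when every non-missing value is a whole number.
is_whole <- function(x) {
  x <- x[!is.na(x)]
  all(x == round(x))
}

# Toy columns standing in for bed (counts) and acre_lot (measurements).
toy <- data.frame(bed = c(3, 4, NA, 2), acre_lot = c(0.12, 0.08, 0.15, NA))

whole <- sapply(toy, is_whole)
print(whole)
```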

# We try a simple equal-spacing check: if the differences between consecutive unique values are all equal, we label the variable as interval scale. Note that unique() returns values in order of appearance, not sorted.

# For bed variable:
unique_values <- unique(realtor$bed)
bed_diff <- diff(unique_values)
print(bed_diff)
 [1]   1  -2   4  -1  -4   8  NA  NA   1   4   1  -3   1  22  -9   4 -14   4   2  -4  -1   4  -2  23 -19  65 -55  -4  15  18 -38  10  67 -50 -20
[36]   1  -7  23 -10  32  55 -98  22
if (all(bed_diff == bed_diff[1])) {
  print("Interval scale.")
} else {
  print("No interval scale")
}
[1] "No interval scale"
# For bath variable:
unique_values <- unique(realtor$bath)
bath_diff <- diff(unique_values)
print(bath_diff)
 [1]   -1    2    2   -1    3   -1   NA   NA    1    1    2    1   22  -24    5   -1    3    2   -6   22  -11   -8    2   37  -14    9  -23  170
[29] -176   11   -6    3   -1   -5   22  -25  102  -84    4
if (all(bath_diff == bath_diff[1])) {
  print("Interval scale.")
} else {
  print("No interval scale")
}
[1] "No interval scale"
# For zip_code variable:
unique_values <- unique(realtor$zip_code)
zip_diff <- diff(unique_values)
print(zip_diff)
   [1]    194    -64    -51    -68     27     91    -60     -8      7    -28     57     30    -43    -75     66     93    -53    -60    110
  [20]    -49     40   -133     35    -57      1      3     54     -7    -30     59      1     -6    -10      7     13    -70     61     15
  [39]     34   -100  94373 -94354    303      4   -261     91    -64    -97     14    351   -350      1     NA     NA    -65    312   -326
  [58]      2    310    -89    -96   -130     14     54      5    252     -9   -350    181    -69     40    -27     44    169   -257    -89
  [77]    100    -17    277    -44   -137    185    -45      2   -223     51     24    178    -36      7   -190    229   -230      3    -25
  [96]      9    238     21    -25   -185    213    -26    -52      5     83   -288     84    -24    -49     53    -30    -15    245   -220
 [115]    150     10      7     52    -55     68    -58      3    -11     50   -222    234    -70   -174    223      2    -37    -13   -114
 [134]    172    -22    -17    -16     61    -20   -162   -162    143    -52     54     27    -22     50    -10     20     10     56     30
 [153]     65      1    339     -3   -305    268   -239    314   -225    -58    282   -355    -13      6     62    -40     69    -48    -34
 [172]      5    337   -324     33    -13    -55     67    -40      6    304   -289    320    -29    -13     20     28   -283    259   -273
 [191]    272   -285     19    -34    276   -245      4    281   -358     -4   5271    -41  -5129   4907     50    -26      2     32  -4966
 [210]   4921  -5001     77    -48    531   4488  -4958   4964  -4614   -339   -119      1     15      8     64    125   -193    223      3
 [229]   -179   5019  -5065    554   -338    -27   -170   5182   -144  -5005   2364   2637    -41     28  -4976   5136  -5195     45    -44
 [248]   5211  -4717   4916  -5376     -1    200   4790  -4825   4115  -4011   -114     18    -17    124   -249    100   4834  -4768   4796
 [267]  -4826    215   -195     -2   4843    -72  -4796   4860  -4845     14   4819    -20  -4794   4754      3    735  -5649   4986      2
 [286]  -4873   -116    132   4832  -4808   4769    -12  -4652   4714  -4845   4788  -4780   -141    438   -473    298    169    -62     70
 [305]   -451    -12   4982    -61     31     25    -55     11     79     -7    -75   2006  -2655   6661  -5533   -400  -4970      9   5122
 [324]  -4913   4900  -4869     68   4988   -306    -10    160   -235  -2563  -1949   1944   2608  -4695     60     77    -14   1950   1850
 [343]     53  -4014   4012    -91  -4021    322    -55     83   -150     97    -13     88     -8    -38     36    -62     70      1    -84
 [362]     80   -101     99     -2      3    -84     -2  28315 -28252      9    -32     49    -64     -3     -6   -106    178   3750  -3808
 [381]    -32   4724     36    512   -556     47  -4710   4664     24     23    -27     -8     15  -1061    162    897     17  -4707   4671
 [400]  -4766   5316   -674      3     -8    114     23   -143     -1     14    -16     11     -7    497   -135    -47   -313    -56     58
 [419]    346   6064     -5      1     12     17  -6507     30  -4803    -12  10763     12     94    -33    -30    124  -5389   5335    -24
 [438]    459  -5814      8  -5501  11255   -328  -5422   5449  -5425   5381    -30   -110     66  -5300   5331  -6057   6890   -805   -135
 [457]     19  -5241   5228   1428   -949   -334   -107  -5264   5342    -83  87942 -87863    -61  -6818   7559  -6062   5328  -5305   5286
 [476]  -5313   5430  -5427   5280    469   -441    473   -369  -5372    -14   5353    -43  -6749   7065   -290  -5337   5418  -5434   5290
 [495]    476  -6113      1   -379    443     12   -441    390  -1099   1095   -385    398   -383     43    313     49    -21      4    -17
 [514]  -1094   1099     38   6042     69  -5807   5760    -44     21     -8     57  -5814   5824     13    -27   -114     51     74  -7222
 [533]  -1894    -31     12      3     23   1877  -1894     15    -22      3      7      6      5    -22     14     -8   1708  -1701   1894
 [552]    -10     14    -10  -3925     11      1      2    315   1335  -1631    368   -101     21   -278    -13    114    -61     20    -72
 [571]    268     57    110   -132   1322  -1224   1210  -1598    390     25   -417     43    270     49   -314    230    123   -111     74
 [590]   1225  -1200   1198     11  -1600    390   -132    109    -85   -210    -63      5    402    -14   1209    -23     16    -31   -540
 [609]    562  -1353   -121    901   -781    102    650    633    -33    -22     79    171    -37   -162    -39    397   -383    385   -372
 [628]    379   -386    -16  -1275   1331    -49    -16     71    -57  -1484    -49    227      2   -179    -53    262  35982 -36039   -187
 [647]    223   -199    215    -43    691     31    -32     31   -561    -89    675   -716    721   -637   1235      8  -1277   -306    252
 [666]   1074     65      1   3524  -4891   1330  20477 -21589   1111    -19   3532    -91    -14    -10      8     72  -3522     95    -21
 [685]     33    -16      9      1     -4   -151    159    -81     14     41    -36   -803    649    158    -90     60     33   3490  -3512
 [704]   -804    -19     37     -3   -283   1095   -772    669     40   -772    878    -92    -10    111     -7    -50     60    -16    -15
 [723]     18     -1    -89     68   3354    138  -3549     -3   3548  -3481    -18    -11    -45   3566  -3552     54     17    -76      8
 [742]   3539  -3565     20   3530  -3590     53     50     26     18   -881    838   -821    819    -59   -764    715   -583    286   -318
 [761]     -4    301   -313      3     -5    -15     16     18    309     34    -33   -341     58    289   -266    -70    695   -762   -221
 [780]      2      2    310     60    257    -26   -540    237    -27    650   -685     53    340     -7   -359     33    337    -18   -580
 [799]    258   -271    247     31      6    -11   -253    263   -286    275    -19    348   -528    521      1     32   -375    -93    -77
 [818]    172    -31   -257     88    543   -397      5     52   -103    101      5     59    -58    -17   -281    295    -15    100   -101
 [837]     27    -10    -13    203   -356    408   -336    324     19   -543      2      1      5     20     53   1952      9  -1922   1896
 [856]  -1863   1841      7  -1864   1858  -1876   1860  -1873    -23     68      1    -61     59   1062    841    -66     23   -765    796
 [875]  -1923   1891  -1919   1120    816  -1888    -69      1      2      1      2      1    280     -4   -118    123   -146    300   -414
 [894]      1     14    225      2   -146     20    145   -140    -84    -21    207    -91    130    -21    181    -12     31   1492      8
 [913]    -26  -1465    -41     26     -5    -27    -31      1     80    -22    420    -13   -445     11    446   -455      6    387   -380
 [932]    383   -373     -1    418    -46   -351    -26    436    -75     13     11     19     -2   -205    232   -212     18    -44    206
 [951]   -167     -9   -200    197    180   -186    -19     98    108   -177    -26    111   -112     65     34     14     19   -128    115
 [970]   -120      4    134   -130     25    151   -162    191   -104    -90      2     91    -17    -55    100     -3     -8    -30    112
 [989]   -182     69    111   -113     29    -22      3     -2     31    -20     -8      4
 [ reached getOption("max.print") -- omitted 2191 entries ]
if (all(zip_diff == zip_diff[1])) {
  print("Interval scale.")
} else {
  print("No interval scale")
}
[1] "No interval scale"

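The check above prints "No interval scale" for all three variables, but it only tests whether the unique values happen to be equally spaced in the order they appear in the data, which is not what an interval scale means. Judged by meaning instead: bed and bath are counts, so the difference between two values is meaningful (they are interval, and in fact ratio, scale), whereas zip_code is a label and the difference between two ZIP codes has no meaning.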

  1. (1pt) are there any duplicate observations in the data? If so, remove them. You can use “duplicated” or “unique” functions to answer this question. See an example here.
# Using the duplicated() function: store the duplicate rows first, then remove them
duplicate_rows <- realtor[duplicated(realtor), ]
realtor <- realtor[!duplicated(realtor), ]
print(realtor)
text <- "The number of duplicate rows are: "
print(paste(text,nrow(duplicate_rows)))
[1] "The number of duplicate rows are:  809370"

We have 809,370 duplicate rows, which we removed, leaving 113,789 rows.
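The counting logic can be checked on a small example. The sketch below uses a made-up data frame (`toy` and its values are hypothetical) and shows that the number of flagged duplicates plus the number of remaining rows equals the original row count.

```r
# Toy data frame with two repeated rows (hypothetical values).
toy <- data.frame(price = c(100, 100, 200, 200, 300),
                  bed   = c(2, 2, 3, 3, 4))

# duplicated() flags every repeat of an earlier row, not the first occurrence.
n_dup <- sum(duplicated(toy))

# Keep only the first occurrence of each row.
toy_unique <- toy[!duplicated(toy), ]

print(paste("Duplicate rows:", n_dup))
print(paste("Rows remaining:", nrow(toy_unique)))
```

The same identity holds for the figures reported above: 809,370 + 113,789 = 923,159.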

  1. (0.5pt) Does any of the variables have missing values? Which ones?
summary(realtor)
    status              price                bed               bath            acre_lot         full_address          street         
 Length:113789      Min.   :        0   Min.   :  1.000   Min.   :  1.000   Min.   :     0.00   Length:113789      Length:113789     
 Class :character   1st Qu.:   250000   1st Qu.:  2.000   1st Qu.:  2.000   1st Qu.:     0.11   Class :character   Class :character  
 Mode  :character   Median :   449900   Median :  3.000   Median :  2.000   Median :     0.26   Mode  :character   Mode  :character  
                    Mean   :   909606   Mean   :  3.309   Mean   :  2.521   Mean   :    17.74                                        
                    3rd Qu.:   800000   3rd Qu.:  4.000   3rd Qu.:  3.000   3rd Qu.:     1.03                                        
                    Max.   :875000000   Max.   :123.000   Max.   :198.000   Max.   :100000.00                                        
                    NA's   :18          NA's   :17516     NA's   :16297     NA's   :31123                                            
     city              state              zip_code       house_size       sold_date        
 Length:113789      Length:113789      Min.   :  601   Min.   :    100   Length:113789     
 Class :character   Class :character   1st Qu.: 6010   1st Qu.:   1152   Class :character  
 Mode  :character   Mode  :character   Median : 8005   Median :   1664   Mode  :character  
                                       Mean   : 8267   Mean   :   2163                     
                                       3rd Qu.:10301   3rd Qu.:   2499                     
                                       Max.   :99999   Max.   :1450112                     
                                       NA's   :33      NA's   :36448                       

According to the summary function, we saw that the columns: price, bed, bath, acre_lot, zip_code, house_size have some missing values.

  1. (0.5pt) Remove all houses with price less than or equal to 50K
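# Note: rows with a missing price are not dropped here; the NA comparison turns them into all-NA rows
realtor <- realtor[realtor$price > 50000, ]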
print(realtor)
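Logical subsetting treats a missing price in a way that is easy to overlook, and this would be consistent with the 18 NA's that show up in the state summary later. The sketch below uses a made-up data frame (`toy` is hypothetical) to compare plain logical indexing with which(), which is one possible way to exclude rows whose price is missing.

```r
# Toy data with one missing price (hypothetical values).
toy <- data.frame(price = c(40000, 120000, NA, 75000),
                  city  = c("A", "B", "C", "D"))

# Logical indexing: the NA comparison yields a row made entirely of NAs.
kept_logical <- toy[toy$price > 50000, ]

# which() ignores NA comparisons, so rows with a missing price are excluded.
kept_which <- toy[which(toy$price > 50000), ]

print(nrow(kept_logical))
print(nrow(kept_which))
```

Whether rows with a missing price should be kept or dropped is a judgment call that the question leaves open; the point is only that the two forms behave differently.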
  1. (1pt) The price variable appears to have some extreme values. Remove the outliers in the “price” variable using the IQR method. IQR and quantile functions throw error if you have NAs in your variable. Use na.rm=TRUE option inside IQR and quantile methods to ignore the missing price values.
# Compute the IQR and the outlier fences. As price has missing values, we use na.rm = TRUE
iqr <- IQR(realtor$price, na.rm = TRUE)
# q1 and q3 hold the lower and upper fences (Q1 - 1.5 * IQR and Q3 + 1.5 * IQR), not the quartiles themselves
q1 <- quantile(realtor$price, 0.25, na.rm = TRUE) - 1.5 * iqr
q3 <- quantile(realtor$price, 0.75, na.rm = TRUE) + 1.5 * iqr
print(iqr)
[1] 560000
realtor <- realtor[!(realtor$price < q1 | realtor$price > q3), ]
print(realtor)

Now we are left with 98,804 rows in our data frame.
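The IQR rule can be traced by hand on a small vector. The sketch below is a worked example on made-up prices (`price`, `lower` and `upper` are hypothetical names): the quartiles are 125 and 170, so the IQR is 45 and the fences are 57.5 and 237.5.

```r
# Toy prices with one extreme value and one NA (hypothetical values).
price <- c(100, 120, 130, 150, 160, 180, 5000, NA)

q <- quantile(price, c(0.25, 0.75), na.rm = TRUE)
iqr <- IQR(price, na.rm = TRUE)

# Fences: 1.5 * IQR below the first quartile and above the third.
lower <- q[[1]] - 1.5 * iqr
upper <- q[[2]] + 1.5 * iqr

# Flag values outside the fences; NA prices are not flagged.
is_outlier <- !is.na(price) & (price < lower | price > upper)
print(price[is_outlier])
```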

  1. (0.5 pt) Draw a histogram and boxplot of the price. What can you say about the shape of price variable? Is the price variable positively skewed, symmetric, or negatively skewed?
hist(realtor$price, main = "The histogram for the price variable", xlab = "Price", ylab = "Frequency")

The histogram is right-skewed (positively skewed): most prices sit at the lower end, with a long tail toward high prices.

# The boxplot for this is:-
boxplot(realtor$price, main = "The boxplot for the price variable", ylab = "Price")
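A numeric check can back up the visual impression of skewness. The sketch below computes a moment-based skewness on a made-up sample (`x` is hypothetical; base R has no built-in skewness function, so the formula is written out). A positive value, together with a mean above the median, points to a right-skewed distribution.

```r
# Toy right-skewed sample (hypothetical values).
x <- c(1, 2, 2, 3, 3, 3, 4, 5, 20)

# Moment-based skewness: mean cubed deviation divided by the cubed
# population-style standard deviation.
m <- mean(x)
s <- sqrt(mean((x - m)^2))
skew <- mean((x - m)^3) / s^3

print(c(mean = m, median = median(x), skewness = skew))
```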

  1. (1pt) what percentage of the observations are missing for the price variable?
missing_observation <- mean(is.na(realtor$price)) * 100
print(missing_observation)
[1] 0.01821789
  1. (0.5pt) Use as.Date method to convert the sold_date variable from character string to date/time type. Then from this date/time object create two new attributes (sold_year) and (sold_month) to store the year and month that the house was sold (see an example here: https://statisticsglobe.com/extract-month-from-datein-r )
# Parse the dates in year-month-day format
realtor$sold_date <- as.Date(realtor$sold_date, format = "%Y-%m-%d")
print(realtor)

Here, sold_date has 42,823 missing values, and the column has been converted to the Date type.

# Use format() to extract the year and month from sold_date and store them as two new numeric attributes
realtor$sold_year <- as.numeric(format(realtor$sold_date, "%Y"))
realtor$sold_month <- as.numeric(format(realtor$sold_date, "%m"))
print(realtor)
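The conversion and extraction steps can be tried on a few strings first. The sketch below uses made-up dates (`sold` is hypothetical), with an empty string standing in for a listing that has no sale date; it becomes NA and stays NA in the derived year and month.

```r
# Toy sold_date strings; the empty string mimics a missing date (hypothetical values).
sold <- c("2021-03-15", "", "2019-11-02")

# Map empty strings to NA explicitly, then parse.
sold[sold == ""] <- NA
sold_date <- as.Date(sold, format = "%Y-%m-%d")

# format() returns character, so convert to numeric.
sold_year  <- as.numeric(format(sold_date, "%Y"))
sold_month <- as.numeric(format(sold_date, "%m"))

print(data.frame(sold_date, sold_year, sold_month))
```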
  1. (0.5 pt) convert the “state” attribute to factor then take a summary to see how many observations are there for each state. Remove states with only one observation from the data.
realtor$state <- factor(realtor$state)
print(realtor)
summary(realtor$state)
   Connecticut       Delaware        Georgia          Maine  Massachusetts  New Hampshire     New Jersey       New York   Pennsylvania 
         12674           1262              5           4012           8673           3234          30363          21682           8513 
   Puerto Rico   Rhode Island        Vermont Virgin Islands       Virginia  West Virginia        Wyoming           NA's 
          2298           3249           2206            606              7              1              1             18 

Most states have many observations, but West Virginia and Wyoming have only 1 observation each, so we’ll remove those.

# Inspect the observation for West Virginia
value <- realtor[realtor$state == "West Virginia", ]
print(value)

Let’s delete the rows for both single-observation states

realtor <- realtor[realtor$state != "West Virginia", ]
realtor <- realtor[realtor$state != "Wyoming", ]
print(realtor)
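The same removal can be written without naming the states by hand. The sketch below is one possible generalization on made-up data (`toy` and `singletons` are hypothetical names): it finds every state with exactly one observation from table(), drops those rows, and then drops the unused factor levels.

```r
# Toy data: state "C" appears only once (hypothetical values).
toy <- data.frame(state = factor(c("A", "A", "B", "B", "B", "C")),
                  price = c(10, 12, 20, 22, 21, 99))

# Count observations per state and collect the singletons.
counts <- table(toy$state)
singletons <- names(counts[counts == 1])

# Drop those rows, then drop the now-empty factor levels.
toy <- toy[!(toy$state %in% singletons), ]
toy$state <- droplevels(toy$state)

print(table(toy$state))
```

Using %in% instead of != also avoids creating all-NA rows when state is missing.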
  1. (1pt) Is there a statistically significant difference between the average house price for different states? Use appropriate plot and statistic test to answer this question.

To test whether the average house price differs significantly across states, we use a one-way ANOVA test.

anova_res <- aov(realtor$price ~ realtor$state, data = realtor)
summary(anova_res)
                 Df    Sum Sq   Mean Sq F value Pr(>F)    
realtor$state    13 1.466e+15 1.128e+14    1064 <2e-16 ***
Residuals     98770 1.047e+16 1.060e+11                   
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
18 observations deleted due to missingness

Our null hypothesis is that the mean house price is the same in every state. The ANOVA gives a p-value below 2e-16, far under the chosen significance level of 0.05, so we reject the null hypothesis: at least one state's mean house price differs significantly from the others.
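The question also asks for a plot. The sketch below shows, on made-up data (`toy` is hypothetical), how side-by-side boxplots can accompany the ANOVA and how the p-value can be pulled out of the summary table.

```r
# Toy data: three groups with clearly different means (hypothetical values).
toy <- data.frame(
  state = factor(rep(c("A", "B", "C"), each = 5)),
  price = c(10, 11, 12, 10, 11,
            20, 21, 19, 20, 22,
            30, 29, 31, 30, 32)
)

# Side-by-side boxplots show the spread of price within each state.
boxplot(price ~ state, data = toy, xlab = "State", ylab = "Price")

# One-way ANOVA; the p-value sits in the first row of the summary table.
fit <- aov(price ~ state, data = toy)
p_value <- summary(fit)[[1]][["Pr(>F)"]][1]
print(p_value)
```

Since the price distribution is right-skewed, a rank-based alternative such as kruskal.test() may be worth checking as well; that is a suggestion, not something the analysis above ran.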

12.(1pt) What is the correlation between house_price and the variables sold_year, house_size, bed, and bath? Note: The “cor” function returns error when NAs are present in the variables. Set use=“pairwise.complete.obs” inside the “cor” function to ignore NAs when computing correlation coefficient between a pair of variables

correlation <- cor(realtor[, c("price", "sold_year", "house_size", "bed", "bath")],use="pairwise.complete.obs" )
print(correlation)
                  price    sold_year  house_size         bed       bath
price       1.000000000 -0.001095265  0.17842388  0.20480875  0.4169943
sold_year  -0.001095265  1.000000000 -0.03343316 -0.07504495 -0.0461646
house_size  0.178423878 -0.033433163  1.00000000  0.34031245  0.3486002
bed         0.204808748 -0.075044948  0.34031245  1.00000000  0.6441460
bath        0.416994304 -0.046164604  0.34860019  0.64414601  1.0000000

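According to this matrix, we can see that:

1. Price has a moderate positive correlation with bath (0.42) and weak positive correlations with bed (0.20) and house_size (0.18).

2. Sold_year has essentially no linear correlation with price (about -0.001).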
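To see what use = "pairwise.complete.obs" does, the sketch below runs cor() on a made-up data frame (`toy` is hypothetical) whose missing values fall in different rows, then uses cor.test() on one pair to obtain a p-value alongside the coefficient.

```r
# Toy data with missing values in different rows (hypothetical values).
toy <- data.frame(price = c(100, 150, 200, 250, NA, 400),
                  bath  = c(1, 2, 2, 3, 3, NA),
                  bed   = c(2, NA, 3, 3, 4, 5))

# Each pair of columns uses the rows where both values are present.
r <- cor(toy, use = "pairwise.complete.obs")
print(round(r, 3))

# cor.test() drops incomplete pairs itself and adds a p-value.
ct <- cor.test(toy$price, toy$bath)
print(ct$estimate)
```

Because each pair may be computed on a different subset of rows, coefficients in the same matrix are not always based on the same observations.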

Problem 2: Exploring Heart Disease Dataset In this problem, you are going to explore the heart disease dataset from UCI. This dataset contains 76 attributes but only 14 of them are relevant and used in publications. These 14 attributes are already processed and extracted from the dataset. Click on Data Folder and download the four processed datasets: processed.cleveland.data, processed.hungarian.data, processed.switzerland.data, processed.va.data.

  1. (0.5pt) Open these files and examine the data in them. Note that the files do not have a header and the missing values are marked by “?” character. Each file contains the 14 attributes described here. Load each file to a dataframe (remember to set na.strings = "?" so that “?” is recognized as missing, not as a data value).

Loading all the four datasets

cleveland = read.csv("/Users/nancy/Downloads/processed.cleveland.data", na.strings = "?", header = FALSE )
cleveland
hungarian = read.csv("/Users/nancy/Downloads/processed.hungarian.data",na.strings = "?", header = FALSE )
hungarian
switzerland = read.csv("/Users/nancy/Downloads/processed.switzerland.data", na.strings = "?", header = FALSE)
switzerland
va = read.csv("/Users/nancy/Downloads/processed.va.data",na.strings = "?", header = FALSE)
va
  1. (0.5 pt) Use rbind function to combine the four data frames into one dataframe and manually set the column names using colnames function. The name of each column/attribute is described here.
# Let's first combine these four data frames into one
heart_disease <- rbind(cleveland, hungarian, switzerland, va)
# Now, manually setting the 14 columns according to the document provided:
colnames(heart_disease) <- c("age", "sex", "cp", "trestbps", "chol", "fbs", "restecg", "thalach", "exang", "oldpeak", "slope", "ca", "thal", "num")
print(heart_disease) 
  1. (0.5pt) Explore the overall structure of the dataset. What percentage of rows have missing values in one or more attributes?
# Exploring the overall structure of the data set using summary function
summary(heart_disease)
      age             sex               cp          trestbps          chol            fbs            restecg          thalach     
 Min.   :28.00   Min.   :0.0000   Min.   :1.00   Min.   :  0.0   Min.   :  0.0   Min.   :0.0000   Min.   :0.0000   Min.   : 60.0  
 1st Qu.:47.00   1st Qu.:1.0000   1st Qu.:3.00   1st Qu.:120.0   1st Qu.:175.0   1st Qu.:0.0000   1st Qu.:0.0000   1st Qu.:120.0  
 Median :54.00   Median :1.0000   Median :4.00   Median :130.0   Median :223.0   Median :0.0000   Median :0.0000   Median :140.0  
 Mean   :53.51   Mean   :0.7891   Mean   :3.25   Mean   :132.1   Mean   :199.1   Mean   :0.1663   Mean   :0.6046   Mean   :137.5  
 3rd Qu.:60.00   3rd Qu.:1.0000   3rd Qu.:4.00   3rd Qu.:140.0   3rd Qu.:268.0   3rd Qu.:0.0000   3rd Qu.:1.0000   3rd Qu.:157.0  
 Max.   :77.00   Max.   :1.0000   Max.   :4.00   Max.   :200.0   Max.   :603.0   Max.   :1.0000   Max.   :2.0000   Max.   :202.0  
                                                 NA's   :59      NA's   :30      NA's   :90       NA's   :2        NA's   :55     
     exang           oldpeak            slope             ca              thal            num        
 Min.   :0.0000   Min.   :-2.6000   Min.   :1.000   Min.   :0.0000   Min.   :3.000   Min.   :0.0000  
 1st Qu.:0.0000   1st Qu.: 0.0000   1st Qu.:1.000   1st Qu.:0.0000   1st Qu.:3.000   1st Qu.:0.0000  
 Median :0.0000   Median : 0.5000   Median :2.000   Median :0.0000   Median :6.000   Median :1.0000  
 Mean   :0.3896   Mean   : 0.8788   Mean   :1.771   Mean   :0.6764   Mean   :5.088   Mean   :0.9957  
 3rd Qu.:1.0000   3rd Qu.: 1.5000   3rd Qu.:2.000   3rd Qu.:1.0000   3rd Qu.:7.000   3rd Qu.:2.0000  
 Max.   :1.0000   Max.   : 6.2000   Max.   :3.000   Max.   :3.0000   Max.   :7.000   Max.   :4.0000  
 NA's   :55       NA's   :62        NA's   :309     NA's   :611      NA's   :486                     
str(heart_disease)
'data.frame':   920 obs. of  14 variables:
 $ age     : num  63 67 67 37 41 56 62 57 63 53 ...
 $ sex     : num  1 1 1 1 0 1 0 0 1 1 ...
 $ cp      : num  1 4 4 3 2 2 4 4 4 4 ...
 $ trestbps: num  145 160 120 130 130 120 140 120 130 140 ...
 $ chol    : num  233 286 229 250 204 236 268 354 254 203 ...
 $ fbs     : num  1 0 0 0 0 0 0 0 0 1 ...
 $ restecg : num  2 2 2 0 2 0 2 0 2 2 ...
 $ thalach : num  150 108 129 187 172 178 160 163 147 155 ...
 $ exang   : num  0 1 1 0 0 0 0 1 0 1 ...
 $ oldpeak : num  2.3 1.5 2.6 3.5 1.4 0.8 3.6 0.6 1.4 3.1 ...
 $ slope   : num  3 2 2 3 1 1 3 1 2 3 ...
 $ ca      : num  0 3 2 0 0 0 2 0 1 0 ...
 $ thal    : num  6 3 7 3 3 3 3 3 7 7 ...
 $ num     : int  0 2 1 0 0 0 3 0 2 1 ...

All columns except age, sex, cp and num have some missing values. The combined dataset has 920 rows.

# Number of rows with a missing value in one or more attributes:
missing_rows <- sum(rowSums(is.na(heart_disease)) > 0)
print(missing_rows)
[1] 621

Therefore, 621 rows have a missing value in one or more attributes.

# The % would be:
missing_rows_perc <- (missing_rows / nrow(heart_disease)) * 100
print(missing_rows_perc)
[1] 67.5
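The same percentage can be obtained in one step with complete.cases(). The sketch below uses a made-up data frame (`toy` is hypothetical) in which three of four rows contain at least one NA.

```r
# Toy data frame with NAs in three of four rows (hypothetical values).
toy <- data.frame(age  = c(63, 67, 41, 56),
                  chol = c(233, NA, 204, 236),
                  ca   = c(0, 3, NA, NA))

# complete.cases() is TRUE for rows with no NA; its complement marks
# rows with at least one missing value.
pct_missing <- mean(!complete.cases(toy)) * 100
print(pct_missing)
```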
  1. (2pt) Read the data description carefully. Specify the type of each variable as follows: • Specify whether the variable is categorical(qualitative) or numeric(continuous)? • For qualitative variables, specify whether it is nominal or ordinal. • For numeric variables, specify whether it is discrete or continuous? • For discrete numeric variables specify

After going through the data description, the categorical variables are:

1. sex
2. cp
3. fbs (fasting blood sugar: either greater than 120 mg/dl or not, which gives two classes)
4. restecg
5. exang
6. slope
7. thal

Overall, 7 attributes are categorical and the other 7 (age, trestbps, chol, thalach, oldpeak, ca and num) are numeric.

Among the qualitative variables, the nominal ones are:

1. sex
2. cp: the four types (Value 1: typical angina, Value 2: atypical angina, Value 3: non-anginal pain, Value 4: asymptomatic) have no particular order
3. fbs: two values, greater than 120 or not
4. exang

The ordinal ones are:

1. restecg: the three types appear to follow an order
2. slope
3. thal: coded with numbers that suggest an order

Among the numeric variables, the discrete ones are:

1. age
2. ca
3. num: the description presents it as a diagnosis code, so it behaves like a categorical variable, although in this data it takes the integer values 0 to 4

The continuous ones are:

1. trestbps
2. chol
3. thalach
4. oldpeak

print(heart_disease)
  1. (1pt) Convert all categorical variables to “factor” using factor function ( set the “labels” option to give meaningful names/labels to each level)
# using the factor function:
unique(heart_disease$restecg)
heart_disease$sex <- factor(heart_disease$sex, labels = c("F", "M"))
heart_disease$cp <- factor(heart_disease$cp, labels = c("Typical Angina", "Atypical Angina", "Non-anginal Pain", "Asymptomatic"))
heart_disease$fbs <- factor(heart_disease$fbs, labels = c("False", "True"))
# restecg was listed as categorical above, so convert it as well (labels follow the UCI description)
heart_disease$restecg <- factor(heart_disease$restecg, labels = c("Normal", "ST-T Abnormality", "LV Hypertrophy"))
heart_disease$exang <- factor(heart_disease$exang, labels = c("No", "Yes"))
heart_disease$slope <- factor(heart_disease$slope, labels = c("Upsloping", "Flat", "Downsloping"))
heart_disease$thal <- factor(heart_disease$thal, labels = c("Normal", "Fixed Defect", "Reversable Defect"))
print(heart_disease)
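When only `labels` is given, factor() pairs the labels with the sorted unique values that happen to be present. A slightly more defensive variant, sketched below on made-up codes (`thal` here is a hypothetical toy vector), names the levels explicitly so that each code is tied to its label even if a code is absent from the data.

```r
# Toy thal codes using the coding 3, 6, 7 from the data (values hypothetical).
thal <- c(3, 7, 6, 3, NA, 7)

# Naming the levels ties each code to its label, instead of relying on
# the sorted order of whichever values are present.
thal_f <- factor(thal,
                 levels = c(3, 6, 7),
                 labels = c("Normal", "Fixed Defect", "Reversable Defect"))

print(table(thal_f, useNA = "ifany"))
```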
  1. (0.5 pt) What is the median and mode of the age attribute.
# The mode is the most frequent value; max() would return the largest age (77) instead
mode_age <- as.numeric(names(which.max(table(heart_disease$age))))
median_age <- median(heart_disease$age)
print(paste("The mode is", mode_age))
print(paste("The median is", median_age))
[1] "The median is 54"
  1. (0.5 pt) Use “ifelse” and “factor” functions to create a new factor variable (call it “diagnosis”) which takes the value “No” if column 14 has the value zero and “Yes” otherwise. Replace column 14 of your dataframe with this new variable.
# column 14 is num
heart_disease$diagnosis <- factor(ifelse(heart_disease$num == 0, "No", "Yes"))
print(heart_disease)
# Replacing this column with new diagnosis column
heart_disease$num <- NULL
print(heart_disease)
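The recoding step can be verified on a few rows. The sketch below uses a made-up data frame (`toy` is hypothetical) and fixes the level order with `levels`, which is an optional addition so that "No" is always the reference level.

```r
# Toy num codes (hypothetical values); 0 means no disease.
toy <- data.frame(age = c(63, 67, 41, 56), num = c(0, 2, 0, 4))

# Any nonzero code maps to "Yes"; levels are fixed so "No" comes first.
toy$diagnosis <- factor(ifelse(toy$num == 0, "No", "Yes"),
                        levels = c("No", "Yes"))

# Remove the original column once the new one is in place.
toy$num <- NULL

print(table(toy$diagnosis))
```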
  1. (5 pts) Explore the relationship between “diagnosis” variable you created above and all other 13 attributes in the dataset. Which variables are associated with “diagnosis” use appropriate plots and statistical tests to answer this question. Interpret the result of each test. (Note to get full credit for this question, you should use both an appropriate plot and an appropriate statistics test to examine the relationship between each of these variables and diagnosis. You should also interpret each plot and test statistics.

Ans:

# Finding the relationship of "diagnosis" variable with firstly the numeric variables:
# We'll use box plots and t-test 
# The numeric variables are: age, trestbps, chol, thalach, oldpeak, ca

boxplot(heart_disease$age ~ heart_disease$diagnosis, xlab = "Diagnosis", ylab = "Age")

boxplot(heart_disease$trestbps ~ heart_disease$diagnosis, xlab = "Diagnosis", ylab = "Trestbps")

boxplot(heart_disease$chol ~ heart_disease$diagnosis, xlab = "Diagnosis", ylab = "Chol")

boxplot(heart_disease$thalach ~ heart_disease$diagnosis, xlab = "Diagnosis", ylab = "Thalach")

boxplot(heart_disease$oldpeak ~ heart_disease$diagnosis, xlab = "Diagnosis", ylab = "Oldpeak")

boxplot(heart_disease$ca ~ heart_disease$diagnosis, xlab = "Diagnosis", ylab = "Ca")

# Performing a Welch two-sample t-test for the continuous variables only, which are: trestbps, chol, thalach, oldpeak

trestbps_t_test <- t.test(trestbps ~ diagnosis, data = heart_disease)
chol_t_test <- t.test(chol ~ diagnosis, data = heart_disease)
thalach_t_test <- t.test(thalach ~ diagnosis, data = heart_disease)
oldpeak_t_test <- t.test(oldpeak ~ diagnosis, data = heart_disease)

print(trestbps_t_test)

    Welch Two Sample t-test

data:  trestbps by diagnosis
t = -3.1878, df = 858.85, p-value = 0.001485
alternative hypothesis: true difference in means between group No and group Yes is not equal to 0
95 percent confidence interval:
 -6.56889 -1.56247
sample estimates:
 mean in group No mean in group Yes 
         129.9130          133.9787 
print(chol_t_test)

    Welch Two Sample t-test

data:  chol by diagnosis
t = 7.4756, df = 830.75, p-value = 1.951e-13
alternative hypothesis: true difference in means between group No and group Yes is not equal to 0
95 percent confidence interval:
 37.92323 64.92815
sample estimates:
 mean in group No mean in group Yes 
         227.9056          176.4799 
print(thalach_t_test)

    Welch Two Sample t-test

data:  thalach by diagnosis
t = 12.633, df = 837.18, p-value < 2.2e-16
alternative hypothesis: true difference in means between group No and group Yes is not equal to 0
95 percent confidence interval:
 17.34784 23.72998
sample estimates:
 mean in group No mean in group Yes 
         148.8005          128.2616 
print(oldpeak_t_test)

    Welch Two Sample t-test

data:  oldpeak by diagnosis
t = -12.763, df = 780.9, p-value < 2.2e-16
alternative hypothesis: true difference in means between group No and group Yes is not equal to 0
95 percent confidence interval:
 -0.9742705 -0.7145329
sample estimates:
 mean in group No mean in group Yes 
        0.4182051         1.2626068 

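All four p-values are less than 0.05, so we reject the null hypothesis of equal group means and conclude that each of these variables is associated with diagnosis. Patients with a positive diagnosis have a higher mean resting blood pressure (133.98 vs. 129.91) and a higher mean oldpeak (1.26 vs. 0.42), but a lower mean maximum heart rate (128.26 vs. 148.80) and a lower mean cholesterol (176.48 vs. 227.91). These tests cover the four continuous variables; age and ca are examined with box plots only.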
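The two remaining numeric variables, age and ca, could be tested in the same way. The sketch below is only an illustration on simulated data (`sim` is a made-up data frame, not the UCI values): a Welch t-test for age and, as one possible choice for a small count such as ca (0 to 3 vessels), a Wilcoxon rank-sum test.

```r
set.seed(1)  # simulated data only; these are not the UCI values
n <- 200
sim <- data.frame(
  diagnosis = factor(rep(c("No", "Yes"), each = n / 2)),
  age = c(rnorm(n / 2, mean = 50, sd = 9), rnorm(n / 2, mean = 56, sd = 9)),
  ca = c(rpois(n / 2, 0.3), rpois(n / 2, 1.1))
)
sim$ca <- pmin(sim$ca, 3)  # ca only takes the values 0 to 3

# Welch two-sample t-test for the roughly continuous variable
age_t_test <- t.test(age ~ diagnosis, data = sim)

# Rank-based test for the small count; exact = FALSE because of ties
ca_wilcox <- wilcox.test(ca ~ diagnosis, data = sim, exact = FALSE)

print(age_t_test)
print(ca_wilcox)
```

On the real data the same two calls would use `data = heart_disease`; rows with missing values are dropped by the formula interface by default.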

# We'll perform the Kruskal-Wallis test for the ordinal variables: restecg, slope, thal
restecg_kt <- kruskal.test(restecg ~ diagnosis, data = heart_disease)
slope_kt <- kruskal.test(slope ~ diagnosis, data = heart_disease)
thal_kt <- kruskal.test(thal ~ diagnosis, data = heart_disease)
print(restecg_kt)

    Kruskal-Wallis rank sum test

data:  restecg by diagnosis
Kruskal-Wallis chi-squared = 5.4566, df = 1, p-value = 0.01949
print(slope_kt)

    Kruskal-Wallis rank sum test

data:  slope by diagnosis
Kruskal-Wallis chi-squared = 76.163, df = 1, p-value < 2.2e-16
print(thal_kt)

    Kruskal-Wallis rank sum test

data:  thal by diagnosis
Kruskal-Wallis chi-squared = 101.43, df = 1, p-value < 2.2e-16

Again, all three p-values are less than 0.05 (restecg: 0.01949; slope and thal: < 2.2e-16). Therefore, we reject the null hypothesis and conclude that there is an association between each of these variables and diagnosis.
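Since diagnosis has only two groups (hence df = 1 in each output), the Kruskal-Wallis test here should give the same p-value as a Wilcoxon rank-sum test without continuity correction. The sketch below checks this on simulated ordinal codes; `sim_ord`, `score` and `group` are made-up names and values, not the heart disease data.

```r
set.seed(42)  # simulated ordinal codes, not the UCI data
sim_ord <- data.frame(
  group = factor(rep(c("No", "Yes"), each = 60)),
  score = c(sample(1:3, 60, replace = TRUE, prob = c(0.6, 0.3, 0.1)),
            sample(1:3, 60, replace = TRUE, prob = c(0.3, 0.5, 0.2)))
)

kw <- kruskal.test(score ~ group, data = sim_ord)
wx <- wilcox.test(score ~ group, data = sim_ord,
                  exact = FALSE, correct = FALSE)

# The two p-values should agree up to floating-point error
kw_p <- as.numeric(kw$p.value)
wx_p <- as.numeric(wx$p.value)
print(c(kruskal = kw_p, wilcoxon = wx_p))
print(all.equal(kw_p, wx_p))
```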

# Now, we'll use mosaic plots and chi-square tests for the categorical variables
# The categorical variables are: sex, cp, fbs, restecg, exang, slope, thal
# Constructing mosaic plots
mosaicplot(table(heart_disease$sex, heart_disease$diagnosis), main = "Mosaic Plot", color = c("lightblue", "pink"))

mosaicplot(table(heart_disease$cp, heart_disease$diagnosis), main = "Mosaic Plot", color = c("lightblue", "pink"))

mosaicplot(table(heart_disease$fbs, heart_disease$diagnosis), main = "Mosaic Plot", color = c("lightblue", "pink"))

mosaicplot(table(heart_disease$restecg, heart_disease$diagnosis), main = "Mosaic Plot", color = c("lightblue", "pink"))

mosaicplot(table(heart_disease$exang, heart_disease$diagnosis), main = "Mosaic Plot", color = c("lightblue", "pink"))

mosaicplot(table(heart_disease$slope, heart_disease$diagnosis), main = "Mosaic Plot", color = c("lightblue", "pink"))

mosaicplot(table(heart_disease$thal, heart_disease$diagnosis), main = "Mosaic Plot", color = c("lightblue", "pink"))

# Now, we'll use chi-square test
sex_chisq <- chisq.test(table(heart_disease$sex, heart_disease$diagnosis))
cp_chisq <- chisq.test(table(heart_disease$cp, heart_disease$diagnosis))
fbs_chisq <- chisq.test(table(heart_disease$fbs, heart_disease$diagnosis))
restecg_chisq <- chisq.test(table(heart_disease$restecg, heart_disease$diagnosis))
exang_chisq <- chisq.test(table(heart_disease$exang, heart_disease$diagnosis))
slope_chisq <- chisq.test(table(heart_disease$slope, heart_disease$diagnosis))
thal_chisq <- chisq.test(table(heart_disease$thal, heart_disease$diagnosis))
print(sex_chisq)

    Pearson's Chi-squared test with Yates' continuity correction

data:  table(heart_disease$sex, heart_disease$diagnosis)
X-squared = 85.361, df = 1, p-value < 2.2e-16
print(cp_chisq)

    Pearson's Chi-squared test

data:  table(heart_disease$cp, heart_disease$diagnosis)
X-squared = 268.35, df = 3, p-value < 2.2e-16
print(fbs_chisq)

    Pearson's Chi-squared test with Yates' continuity correction

data:  table(heart_disease$fbs, heart_disease$diagnosis)
X-squared = 16.112, df = 1, p-value = 5.972e-05
print(restecg_chisq)

    Pearson's Chi-squared test

data:  table(heart_disease$restecg, heart_disease$diagnosis)
X-squared = 11.712, df = 2, p-value = 0.002863
print(exang_chisq)

    Pearson's Chi-squared test with Yates' continuity correction

data:  table(heart_disease$exang, heart_disease$diagnosis)
X-squared = 184.02, df = 1, p-value < 2.2e-16
print(slope_chisq)

    Pearson's Chi-squared test

data:  table(heart_disease$slope, heart_disease$diagnosis)
X-squared = 88.852, df = 2, p-value < 2.2e-16
print(thal_chisq)

    Pearson's Chi-squared test

data:  table(heart_disease$thal, heart_disease$diagnosis)
X-squared = 109.05, df = 2, p-value < 2.2e-16

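All seven p-values are below 0.05 (the largest is 0.002863, for restecg), so in each case we reject the null hypothesis of independence and conclude that sex, cp, fbs, restecg, exang, slope, and thal are all associated with diagnosis.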
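A small p-value only says that an association exists, not how strong it is. One common way to compare strength across tables of different sizes is Cramér's V, which base R does not provide directly. The sketch below computes it by hand on made-up counts (`toy_tab` is hypothetical, not the real exang table) and also checks the expected counts, which the chi-square approximation assumes are not too small.

```r
# Made-up 2 x 2 counts standing in for table(exang, diagnosis)
toy_tab <- matrix(c(300, 60, 110, 280), nrow = 2,
                  dimnames = list(exang = c("No", "Yes"),
                                  diagnosis = c("No", "Yes")))

# Uncorrected Pearson statistic, which is what Cramér's V is defined on
chi <- chisq.test(toy_tab, correct = FALSE)

# Rule of thumb: every expected count should be at least 5
min_expected <- min(chi$expected)
print(min_expected)

# Cramér's V = sqrt(X^2 / (n * (min(rows, cols) - 1))), between 0 and 1
cramers_v <- sqrt(unname(chi$statistic) /
                  (sum(toy_tab) * (min(dim(toy_tab)) - 1)))
print(cramers_v)
```

On the real data, `toy_tab` would be replaced by, for example, `table(heart_disease$cp, heart_disease$diagnosis)`.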
